Pii: S0303-2647(01)00144-7

Author

Ugur Halici

Abstract

The reinforcement learning scheme proposed in Halici (J. Biosystems 40 (1997) 83) for the random neural network (RNN) (Neural Computation 1 (1989) 502) is based on reward and performs well in stationary environments. However, when the environment is not stationary, the scheme gets stuck on the previously learned action and extinction is not possible. To overcome this problem, the reinforcement scheme is extended in Halici (Eur. J. Oper. Res. 126 (2000) 288) by introducing a new weight update rule (E-rule) which takes into consideration the internal expectation of reinforcement. Although the E-rule is proposed for the RNN, it can be used for training learning automata or other intelligent systems based on reinforcement learning. This paper looks into the behavior of the learning scheme with internal expectation in environments where the reinforcement is obtained after a sequence of cascaded decisions. The simulation results show that the RNN learns well and that extinction is possible even in cases with several decision steps and hundreds of possible decision paths. © 2001 Elsevier Science Ireland Ltd. All rights reserved.
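The abstract does not state the E-rule itself, so the following is only a toy sketch of the general idea it describes: comparing the received reinforcement with an internal expectation so that a previously learned action can be extinguished. It is not the paper's update rule and does not model the RNN. The two-action environment, the function name `run`, the preference vector `pref`, and the parameters `lr` and `beta` are all assumptions made for illustration.

```python
def run(use_expectation, steps=200, switch=100, lr=0.1, beta=0.1):
    """Toy two-action learner (hypothetical, not the paper's E-rule).

    The rewarded action is 0 before step `switch` and 1 afterwards.
    Returns the list of actions taken.
    """
    pref = [0.5, 0.5]    # bounded preferences for actions 0 and 1, summing to 1
    expectation = 0.0    # internal running estimate of the reinforcement
    actions = []
    for t in range(steps):
        rewarded = 0 if t < switch else 1
        action = 0 if pref[0] >= pref[1] else 1    # greedy choice, ties go to action 0
        reward = 1.0 if action == rewarded else 0.0
        if use_expectation:
            # Signal is reward minus expectation: negative when an expected reward is missing.
            signal = reward - expectation
        else:
            # Reward-only scheme: the signal is never negative.
            signal = reward
        if signal > 0:
            pref[action] += lr * signal * (1.0 - pref[action])   # move toward 1
        else:
            pref[action] += lr * signal * pref[action]           # move toward 0
        pref[1 - action] = 1.0 - pref[action]
        expectation += beta * (reward - expectation)
        actions.append(action)
    return actions


reward_only = run(use_expectation=False)
with_expectation = run(use_expectation=True)
print("reward only, last action:", reward_only[-1])
print("with expectation, last action:", with_expectation[-1])
```

In this sketch the reward-only learner never lowers its preference for action 0 once the reward disappears, which mirrors the "stuck" behavior the abstract attributes to the purely reward-based scheme. The expectation-based variant sees a negative signal when the expected reward stops arriving, so its preference for action 0 decays and it switches to action 1.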

Similar resources

Pii: S0303-2647(00)00092-7

This paper deals with the succession process of a food web model consisting of one herbivore, two autotrophs and available nutrient in the environment in a closed nutrient flux. The model provides a way of describing successional changes in the form of species replacement with increasing nutrient levels. It is shown that distinct threshold (with upper and lower) values of nutrient are required ...

Pii: S0303-2647(00)00108-8

We simulated the diffusion of glutamate, following the release of a single vesicle from a pre-synaptic terminal, in the synaptic cleft by using a Brownian diffusion model based on Langevin equations. The synaptic concentration time course and the time course of quantal excitatory post-synaptic current have been analyzed. The results showed that they depend on the number of receptors located at ...

Pii: S0303-2647(00)00080-0

A system of n asexual populations is considered where both intra- and interspecific frequency-dependent game conflicts with lack of information take place. The concept of a strict n-species ESS is introduced which implies local asymptotic stability of the replicator dynamics of pure phenotypes. The dynamical concept of strict stability is also introduced which turns out to be equivalent to the st...

Pii: S0303-2647(00)00137-4

We present a computational model of a transiently-organized neural membrane molecular system with possible information-processing capacity. The model examines field-induced dipole and quadrupole moments and polarizability in monomeric, dimeric, and trimeric ethenes. Polarization of the ethenes is strongly indicated. This result is interpreted as a significant electronic feature of a molecular c...

Pii: S0303-2647(01)00141-1

Biological photoreceptors transduce and communicate information about visual stimuli to other neurons through a series of signal transformations among physical states such as concentration of a chemical species, current, or the number of open ion channels. We present a communication channel model to quantify the transmission and degradation of visual information in the blowfly photoreceptor cel...

Publication date: 2001
